The Project on Academic Language Socialization (PALS) 
investigated, in a longitudinal study, the process by which urban high school 
students are socialized into subject matter discourse. Specifically, the PALS 
project examined language forms and discourse patterns that both occur in and 
constitute the process of teaching and learning in high school subject matter 
courses. This paper reports that the particular technology of data collection 
being used in the classroom has a strong effect on what is "seen" (and 
therefore collected) in capturing and understanding classroom interaction. 

The paper aims to uncover and clarify these aspects of the process. It begins 
by examining the theory of language being followed and how it has shaped the 
processes of data collection in the classroom. The paper then describes the 
project's thoughts about and experiences with various forms of classroom data 
to illustrate how a particular way to represent the data that make up the raw 
material for analysis was arrived at. Finally, the paper presents some 
preliminary findings that illustrate the effects of taking seriously the 
relationship between theory and data collection. In this way, the project 
hopes to show how theory and technologies/methods of data acquisition are 
intertwined such that a loyalty to one entails a deep examination of the other. 

Introduction: the PALS Project 

The Project on Academic Language Socialization (PALS) is investigating the process by 
which urban high school students are socialized into subject matter discourse. The project is part 
of the National Research Center on English Learning & Achievement's (CELA) efforts to 
understand and improve student literacy and learning across the grades and disciplines. 
Specifically, the PALS project is examining language forms and discourse patterns that both 
occur in and constitute the process of teaching and learning in high school subject matter classes. 
For our data set, we are videotaping four different classes twice a week at our site, which is a 
culturally diverse urban high school in a major Midwestern city. We intend to follow classes in 
science and social studies over the three years of taping. 

Although we have several more years of data collection in the longitudinal study and are thus 
not yet at the stage of providing comprehensive analyses, we consider it important at this point to 
share several aspects of our research process with others who are conducting or who are 
interested in learning about research which closely examines classroom discourse. In conducting 
our research we have found that the particular technology of data collection which we use in the 
classroom has a strong effect on what we “see” — and therefore collect — in capturing and 
understanding classroom interaction. Our research technique developed from careful 
consideration of available technologies and methods. In the paper which follows, our purpose is 
to uncover and clarify these aspects of our process. 

We begin by examining our theory of language and how it has shaped our processes of data 
collection in the classroom. Then we describe our project's thoughts about and experiences with 
various forms of classroom data in order to illustrate how we arrived at a particular way to 
represent the data that make up the raw material for our analyses. Finally, we present some 
preliminary findings that illustrate the effects of taking seriously the relationship between theory 
and data collection. In this way, we hope to show how theory and technologies/methods of data 
acquisition are intertwined such that a loyalty to one entails a deep examination of the other. 



A Theory of Language with Consequences for Fieldwork Practices 

From the outset, we have conceived of language as a dynamic and socially grounded practice. 
Within mainstream linguistics, language is regularly studied as an autonomous system, made up 
of abstract representations of sound, sound sequences, meaning units and their structural 
organization into acceptable strings or sentences. While the tradition of treating language as a 
structurally independent conduit for meaning is associated with brilliant advances in the 
modeling of universal features of language, it has left the work of understanding language in 
action to linguists willing to resist formalist and structuralist domination. Such researchers have 
reached across disciplinary boundaries. Furthermore, some of the finest studies of language in 
action have been done by anthropologists, education researchers, communications scholars and 
sociologists. But even leading approaches to the functions of language in discourse commonly 
use metaphors of information packaging and information flow management, often ignoring 
crucial social and interactional patterns of language use (Chafe 1987, Givon 1983, Hopper 1987, 
1988). 

The approach to language which we are taking in studying multilingual and multi-ethnic 
classes is inspired by the work of sociolinguists (Gumperz 1982, 1992, Schiffrin 1987), 
ethnographers of communication (Duranti 1994, Hymes 1974), conversation analysts (Schegloff 
1992, Heritage and Roth 1995), applied linguists (Hatch 1992), as well as "dyed-in-the-wool" 
linguists (Fox 1986, Hopper 1987, 1988). We follow Paul Hopper in taking an "emergent" view 
of language. Language, by this view, is not a collection of predetermined forms and structures 
with agreed upon functions; it is not a static system. Rather, it is always "provisional ... not 
isolable in principle from general strategies for constructing discourses" (1988:132). 

Going beyond Hopper's conception of emergent grammar, which tends to treat function as 
primarily referential rather than social, we incorporate current perspectives articulated through 
the collaboration of linguists and conversation analysts. These latter researchers also pursue an 
emergent model of language, but their conception is firmly based in the natural habitat of 
language: social interaction. Such a view has been articulated in research by Ochs (1993), 
Duranti (1994), C. Goodwin (1979), Schegloff (1996), Fox (1986), and Ford and Thompson (1996). 
This framework for the description of language is clearly articulated by Ochs, Schegloff and 
Thompson in their introduction to the volume Interaction and Grammar (1996). Grammar is 
taken to be "part of a broader range of resources - organizations of practices, if you will - which 
underlie the organization of social life" (1996:2). These scholars propose that "[g]rammar's 
integrity and efficacy are bound up with its place in larger schemes of organization of human 
conduct" (1996:3-4). For our purposes, then, language is never separate from activity (Levinson 
1992), and the activities we are concerned with, those occurring in high school science and social 
studies classes, are fundamentally social and thoroughly "co-constructed, involving the joint 
creation of . . . form, interpretation, stance, action, activity, identity, institution, skill, ideology, 
emotion, [and] other culturally meaningful reality" (Jacoby and Ochs 1995:171). 

Language is, thus, situated and jointly constituted. Furthermore, and crucially, language is 
temporally grounded and physically embodied. Early work by Harvey Sacks ([1964-1968] 1992) 
made the use of real-time recording indispensable for understanding conversation; and since the 
publication of Charles Goodwin's now classic work on the interactive coordination of gaze, 
posture, and sentence construction (1979, 1981), serious work at the intersection of language and 
interaction has demanded videotape technology. 

In the classes we are studying, it has been our explicit intention to capture as many aspects of 
the complex coordination of language and activity as possible, including, but not limited to, such 
features as pitch, prosody, posture, gaze, silence, and material artifacts (books, chalkboard, etc.). 
The assumption is that there are likely to be “seen but unnoticed” practices (Garfinkel 1967) that 
may be missed by the single viewing of an ethnographer in the field. For instance, the level of 
detail to which participants are sensitive in their interactions can be as brief as a tenth of a second 
(Goodwin 1979), far too quick for an ethnographer in the field to notice, interpret, and record 
continuously. In addition, as members of the speech communities we research, we usually 
assume that we “know” what is happening in an interaction without being able to describe the 
detail of actual events. This imperative to interpret interaction in terms of glosses provided by 
our culture is so encompassing that it can obscure the researcher’s view of how the participants 
use extraordinary and remarkable methods and practices to co-construct an event as ordinary and 
unremarkable. Fortunately, much of the taken-for-granted fabric of our social existence can be 
exposed under repeated viewings of well-recorded material that render it in sufficient detail that 
an analyst can move closer to an account of what is actually happening, as opposed to what he or 
she assumes is happening. In our fieldwork, then, we have struggled to formulate, test, and 
reformulate our data collection procedures in order to capture this detail of class activities as it is 
co-constructed. In the next section, we describe this iterative process of matching data collection 
practices with a socially relevant theory of language. 



An Iteration of Forms of Data Collection 

In our project's consideration of data collection methods, we attempted to anticipate both the 
needs of our theoretical commitment to language and a wide variety of more practical constraints 
such as the current state of available technology and its cost, field conditions, setup time, 
staffing, quantity of data, and many others. In large part, we succeeded in matching our data 
acquisition methods and technology with our project's theoretical intent. Nevertheless, we found, 
and still find, ourselves immersed in a creative tension between the two. Obviously, theory 
shapes our methods and technology; but our methods and technology also challenge us to 
examine our analytical concerns and theoretical assumptions. New ways of acquiring data can 
make possible the discovery of previously unknown phenomena, but only if theoretical 
assumptions can evolve to accommodate such heretofore unnoticed phenomena. 

While methods and technology can enhance one's ability to see certain things (i.e., what is in 
front of the camera, what is picked up by the microphone), they can also create literal and 
figurative blindspots that must be taken into consideration. In order to avoid or minimize the 
effects of these blindspots, we have continually questioned how we go about gathering our data. 
Consider a few of the many questions we have had to answer: 

• In most of our classes we cannot back up the video camera far enough to take 
in all of the students in the classroom. What basis is there for deciding who 
gets left out? 

• If one video camera is good and two are better, would three or more be best? 

• How do we balance the need to gather detailed data using an assortment of 
potentially distracting devices with a desire to remain unobtrusive, and how 
does obtrusiveness affect our subjects' behavior? 

• What theoretical implications are there in the simple act of positioning a 
camera, or in whether the camera shot is static or moving? 

• When we need our ethnographers to operate the cameras, what are we losing 
when they cannot take detailed field notes? 

With a large number of questions like these in mind, we have considered, tested, and 
modified several potential data collection formats before settling on a workable solution. In 
every case, we have found that the differences between what our theory expects and what a 
method can deliver are never fully resolved. Nevertheless, we have reached a powerful 
compromise that opens up a realm of phenomena that could not otherwise have even been 
imagined. What follows is a brief examination of some of the issues we have considered within 
the course of evaluating various methods of data collection. To illustrate the data collection 
decisions we have made, we will select examples from a senior-level physics class. 



Field Notes 

From the beginning, we considered field notes to be of great importance in our understanding 
of the setting in which language is used. Of course, the fundamental assumption in this method 
of data acquisition is that the ethnographer taking notes can accurately capture and represent the 
most important events that he or she observes. This requires that ethnographers be able to do a 
number of things. First, they must be able to determine, in real time, what are the important 
scenes and events in a given setting. Second, they must channel their selective attention to the 
critical scene to the exclusion of other, presumably less important scenes. And third, they must 
be able to record quickly and accurately the elements of the scene for later analysis. Figure 1 
shows an example of some handwritten field notes and a seating chart generated by one of our 
project’s ethnographers during a taping of a senior-level physics class. 

Figure 1. Fieldworker’s seating chart and notes for interaction in a senior-level high school physics class. 

While one of the biggest constraints on ethnographers is the requirement to perform their 
work in real time, there can be significant advantages to this form of data collection. For 
instance, the ethnographer can get a "feel" for the atmosphere of the setting that can add vital 
insight to the eventual analysis of data. She can also add an interpretive dimension, in the form 
of commentary or first impressions, to the events being observed that can serve as the basis for a 
more considered analysis. In addition, this "low-tech" approach greatly simplifies the process of 
gathering data in the field. The ethnographer can simply show up and begin taking notes, or if 
unobtrusiveness is important, she can begin observing and write down her observations later. 

The point we wish to make regarding this method and the ones that follow, however, is that 
the most important consideration in adopting a form of data collection is the needs of the theory 
driving the investigation. For field notes, the obvious question is whether the research goals can 
be met using a note-taking format of data collection. In our case, we were indeed interested in 
the ethnographic setting of the classroom, but our theoretical commitments to interactional 
sociolinguistic methods also require us to capture as many of the details of interaction as 
possible, details that provide a temporal grounding from which to examine the talk. Such detail would 
obviously overwhelm an observer's ability to capture it in real time, so we knew from the 
beginning that we had to supplement field notes with electronic recordings of the interaction. 



Audio Recording 

Audio recording of interaction offers a number of advantages and has, until recent advances 
in video technology, been the most popular technique of data collection for researchers 
investigating language in natural settings. This method obviously assumes that what is important 
is what can be heard and not what can be seen, which in turn reflects a theory of language that 
does not include non-verbal actions. But the emphasis on linguistic sounds is not surprising, 
given the difficulty of identifying non-linguistic sounds on the tape. By adding this method to 
ethnographic note taking, we also increase technological considerations, with matters such as 
microphone placement, tape changing, and equipment failure threatening the quality of data. In 
addition, obtrusiveness and the potential for observer effects increase as subjects are more 
cognizant that their talk is being recorded. 

On the positive side, audio recording frees the researcher by reducing the level of detail 
required in note taking; he is now free to make other kinds of observations about the interaction. 
Also, the tape recording serves as a kind of second-order approximation of actual events as they 
happened at the setting, a data source which can be examined repeatedly for features that might 
go unnoticed in a first hearing (Sacks 1992). Audio recording gives us access to the temporal 
unfolding of the interaction at hand. While it requires a bit of experience to make good 
recordings, the equipment is usually rugged, forgiving, and simple to operate. When combined 
with a good omnidirectional microphone to minimize setup considerations, the investigator can 
more or less forget the recorder and concentrate on other features of the setting. Most 
importantly, an audio recording provides a temporal grounding for the talk, one that re-presents 
the sequential structure of the participants’ talk for the analyst. Typically, the analyst will 
characterize the audio data as a text in the form of a transcript in order to make it more 
accessible. Figure 2 is an example of a transcribed portion of an audio recording from the same 
class and day that the field notes in Figure 1 were taken*. In the excerpt, the teacher (TE) is 
discussing assignments that are due before an exam when a student (AL) draws attention to a 
discrepancy in a particular due date. This stretch of talk will be described and analyzed in much 
greater detail in the final section of this paper. 

Date: 5/6/97 
Start Time: 0:16:40 
Excerpt Length: 0:00:27 
Transcriber: CHF 

 1  TE: I have the eye anatomy (.) six dee I have
 2      (0.5)
 3  TE: some of you are turning that in today,
 4      (0.7)
 5  TE: a:nd some of you are turning in six cee today.
 6      (1.3)
 7  TE: right? thi[s is it] its got to be in today.
 8  KI:           [(yeah) ]
 9  AL: hey but it says no- (0.2) look en ell pee=
10  TE: =right. and [the reason
11  AL:             [five five ninety-seven
12  TE: right (.) but the reason i'm moving back a day is because
13      of that ay pee thing y[esterday
14  AL:                       [(if here) they coulda handed it in
15  JA: well some [of us:
16  KI:           [(WHY DON’T YOU) WOMEN [SHUT UP
17  JA:                                  [have
18  TE:                                  [it’s oka[y

Figure 2. Transcript of classroom audio excerpt from the same senior-level high school physics class. 

* The transcript is formatted in the conversation analytic style developed by Gail Jefferson (Atkinson and Heritage 
1984). In this style, brackets ([ or ]) indicate overlapping talk, equal signs (=) indicate talk that is latched or 
contiguous with the following turn of talk, numbers in parentheses indicate length of pause, periods (.) in parentheses 
represent micropauses of less than 0.2 seconds, a colon (:) after a letter shows a lengthening of sound, and 
underlining and capital letters are used to indicate emphasis and loudness respectively. 
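
Because the transcription conventions used in Figure 2 are regular, some of the temporal detail they encode can be tallied mechanically. The following is only a minimal sketch, not part of our project's procedures: the function name `summarize_pauses` and the two regular expressions are our own illustrative assumptions, and the code handles only timed pauses such as (0.5) and micropauses written as (.).

```python
import re

# Assumed patterns for two Jefferson-style conventions:
# a timed pause like (0.5) and a micropause written (.)
TIMED_PAUSE = re.compile(r"\((\d+(?:\.\d+)?)\)")
MICROPAUSE = re.compile(r"\(\.\)")


def summarize_pauses(lines):
    """Collect timed pauses (in seconds) and count micropauses in transcript lines."""
    timed = []
    micropauses = 0
    for line in lines:
        timed.extend(float(value) for value in TIMED_PAUSE.findall(line))
        micropauses += len(MICROPAUSE.findall(line))
    return {
        "timed": timed,
        "total_timed": round(sum(timed), 2),
        "micropauses": micropauses,
    }


# A few lines taken from the excerpt in Figure 2.
excerpt = [
    "TE: I have the eye anatomy (.) six dee I have",
    "(0.5)",
    "TE: some of you are turning that in today,",
    "(0.7)",
    "TE: a:nd some of you are turning in six cee today.",
    "(1.3)",
    "AL: hey but it says no- (0.2) look en ell pee=",
]
summary = summarize_pauses(excerpt)
print(summary)
```

A tally of this kind says nothing about what a silence is doing in the interaction; it simply shows that the transcript preserves timing information that field notes alone could not.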

For some applications, however, audio recording is less than ideal. For instance, because the 
difficulty of identifying speakers increases dramatically with the number of speakers being 
recorded, audio recording was not a practical way for us to gather data in a classroom of thirty 
students. But even if such practical matters were not a problem, we would have eliminated audio 
recording as a candidate method of data gathering for theoretical reasons. As noted above, our 
theory of language directs attention to a wider variety of features than can be captured by an 
audio recording, even when combined with field notes. For us, the panoply of physically 
embodied actions is of critical interest. Silent gestures, eye gaze, facial expressions, body 
posture, movements of all kinds, and sounds that require visual identification are all potentially 
important features of the interaction in a setting (Ford, Fox & Thompson 1996). Thus, when the 
definition of language grows beyond the spoken word, audio recording fails to meet the needs of 
data collection. 

Video Recording: The Class 

Video recording the subjects in our classroom setting was an obvious choice given our 
theoretical dictates. As discussed above, we view language as an interactional achievement 
temporally grounded and physically embodied. In fact, the decision to gather data using video 
recorders was made well before we first set foot in the classroom. At first glance, video seems 
wonderfully suited to methods of qualitative analysis, and, indeed, it is a revolutionary 
technology for this purpose. But it is still a second-order approximation of the actual event, like 
audio recording. Unlike audio, however, its "eye" suffers from unidirectionality and must be 
positioned with the needs of analysis in mind. In addition, the number of technological 
considerations rises dramatically with video. There is more equipment (e.g., tripods, microphone 
stands, cabling, and lenses), it is more complex and difficult to operate, and it is more prone to 
failure and mistakes (Goodwin 1993). 

But the advantages of video recording are compelling. Identifying speakers is made much 
easier by watching, not just the movement of lips, but the motion, gaze, and posture of 
participants. Suddenly, gesture and movement can make sense of otherwise puzzling statements. 
Facial expressions and direction of gaze can uncover active participants who would otherwise 
remain invisible by their silence. Bodily posture can speak volumes about who or what is the 
center of attention. Perhaps because we are such visual creatures, a video recording also gives a 
much greater (though still artificial) sense of "being there." Given the availability of excellent 
equipment at reasonable prices, video recording is an extremely valuable tool for the researcher 
using qualitative methods. 

Because video can be such a compelling medium, however, one has to be cautious in how it 
is deployed. For instance, the highly directional "eye" of the video camera determines what is 
"seen" for purposes of analysis by later viewers. So our desire to capture as much of the 
interaction as possible has had to be balanced with the fact that the camera cannot capture all of 
the interaction in our classrooms. Once we were in the classrooms, it became clear that we could 
not back the camera up far enough to capture the entire class and the instructor. At first, we tried 
using the "pan and scan" technique familiar to anyone who has watched a theatrical release 
movie on a television screen. But we soon realized that this technique produced two problems. 
First, the camera was always lagging behind the action because the operator could not predict where the 
next center of action would be located. Second, the technique introduced a tremendous 
subjective component into the recorded data as operators were forced to continually make 
instantaneous decisions about who or what was worthy of the camera's precious focus, thus 
unintentionally participating in the co-construction of classroom events as the analyst would 
eventually view them. Finally, we tried to compromise with a static shot of the class taken from 
the front of the classroom (see Figure 3). This, by necessity, excluded the instructor and 
usually a few students. Moreover, it turns out that the unseen students are those who sit at the 
periphery of the class, raising the question of whether these students might be peripheralized in 
other ways as well. 

Figure 3. Example of a single camera video recording of the class. 

Another obvious concern is what effect being videotaped will have on the behavior of the 
students and instructor. One could easily conceive of a class or a teacher performing in ways that 
were meant to create a particular impression for the camera. In fact, both the students and the 
teachers in our data make regular reference to the fact that they are being videotaped, and some 
students blatantly "play" to the camera. But while we are sure that we are having some effect on 
our subjects, it also seems clear that the students habituate themselves to our presence and we 
become less influential over time. Moreover, our focus is on the detailed conversational 
practices within the classroom, a large part of which we assume is behavior that is beyond the 
ability of most people to alter significantly for extended periods of time. Ultimately, this is an 
issue that can be partly addressed by a comparison of our field notes taken in classes before and 
after a camera was introduced. 



Video Recording: The Teacher 



It became painfully obvious upon viewing the tapes from cameras with a static shot of the 
students that we were missing a crucial component of the ongoing interaction: the teacher. In the 
physics class which we are using as the example for this paper, there tended to be a 
teacher-fronted organizational pattern. (However, even those classrooms which are relatively more 
student-centered present similar challenges in video recording when trying to capture the entire 
picture of the class.) Inasmuch as most of the class was focused on the disembodied voice of the 
off-screen instructor, we felt like we had lost literally half of the conversation — even though we 
still had the audio portion of all the participants' talk. Because we relied on this form of data collection, 
the teacher's writing at the board, gesturing, facial expression, direction of attention, and noiseless 
activities were all lost for purposes of analysis. Such absences were unacceptable given our 
commitment to a study of discursive practices and a theory that emphasized the goal of 
examining all the relevant activities of all the participants. 

At this time, we felt a strong need to try a different approach. In several previous tapings of 
classes, we had experimented with shooting from the rear of the class in order to capture the 
teacher's activities in the classroom interaction (see Figure 4). This provided a strong sense of 
what was going on in the class as far as the teacher was concerned, but it lost much of the 
students' contributions by capturing only the backs of their heads. Without a head-on view of the 
students, we were losing their displays of participation in an interaction, and they were becoming 
a faceless audience for the teacher's performance. In effect, we were privileging the teacher's 
activities as more important to the classroom interaction than the students' activities. This 
bias was as bad as that of the class-only approach, and either technique was bound to affect 
our eventual analyses negatively. 



Two-Camera Video Recording: Class and Teacher 

We have mentioned how the physical limits of the classroom prevented us from backing up 
to get a better shot, but even if we were allowed limitless space to back up the camera, it would 
not have solved a more fundamental problem: no single camera can capture the head-on 
perspective of two or more participants in a face-to-face interaction. There is simply no position 
from which the camera can pick up each participant’s complete facial expression and body 
movement as seen by the other participant. This is a crucial limitation that frustrates any attempt 
to capture anything near the full array of interactional resources available to each of the 
participants in an interaction. 

Figure 4. Example of a single camera video recording of the teacher. 

We arrived at a solution to this dilemma: after much discussion, we decided to commit to 
using two cameras to videotape in each classroom, despite the formidable added work this would 
mean for the field workers, who needed to set up and break down their equipment rapidly to 
move from class to class. One camera would maintain a static shot and capture the faces of the 
students from the front of the classroom, while the second camera would zoom in and track the 
teacher from the back of the classroom. This way, we could capture the two-way dynamic of 
teacher-fronted classes by getting a head-on view of both the students and the teacher. 

One might ask why we stopped at two cameras, instead of using more cameras to get even 
more of the interaction. After all, a few students are still being missed by the static camera shot 
of the classroom. But, when all was said and done, we were satisfied that we had made the most 
important jump in the quality of our data by using only two cameras: it captured most of the 
features of interactants who were facing each other, which was an essential piece of the data. 

Three cameras would not have produced the same kind of increase in data quality. Moreover, 
there are formidable obstacles to the introduction of multiple video cameras. Aside from the 
extra expense of purchasing the additional cameras (a big limitation for many researchers in this 
area), a project has to hire additional personnel to transport, set up and operate each additional 
camera and its associated equipment. In crowded classrooms, finding a place for even a single 
camera can be a challenge. And when students break from lecture to form workgroups, each 
camera must be quickly moved to avoid getting in the way and to reposition it for recording - all 
while the tape is still running. Furthermore, if the researcher values detailed field notes, it has to 
be recognized that someone operating a video camera is not going to be able to take extensive 
field notes. This is a problem we have identified for our own project and one that we are 
working to resolve. Finally, if the desire is to be an unobtrusive presence in the classroom, each 
additional camera only serves to defeat that goal. Two cameras are, for us and until further 
technological developments or greatly increased funding, the practical limit in classroom data 
acquisition, but one that allows us relatively good access to the details of interaction. 

A final socio-technological consideration lies in how we decided to analyze the two separate 
but related videotapes. We considered sequential viewings of each tape, but rejected this option 
because it defeated the intent of an interactive approach to language by disengaging the 
participants in the interaction. We thought of using two video monitors and starting the tapes at 
the same time, but, aside from the significant difficulty of getting two VCRs to start at exactly 
the same time, we realized that the equipment demands (two monitors, two VCRs) were 
excessive and unlikely to be available for purposes of presentation. We finally learned of 
and settled on the picture-in-picture (PIP) technique for combining video images. In this 
technique, the two tapes are synchronized (using a video signal mixer) and one of the images is 
compressed and superimposed in a corner of the larger, main-screen image. As there are more 
subjects and details in the classroom view and because it is a static shot, we make that the larger, 
main-screen image and "pip" a compressed image of the teacher into the upper left or right corner 
of the main-screen image (see Figure 5). Because the camera on the teacher is usually zoomed in, 
this partly compensates for the reduced image size of the view. 
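
For readers who think in digital rather than analog terms, the logic of "pipping" can be sketched in a few lines of code. This is an illustrative analogy only: our project used a video signal mixer on tape, not software, and every name below (`shrink`, `pip`, `pip_sequence`, `teacher_offset`) is a hypothetical of our own. Frames are modeled as simple grids of pixel values, the inset is compressed by keeping every n-th row and column, and synchronization is reduced to a known frame offset between the two tapes.

```python
def shrink(frame, factor):
    """Nearest-neighbour downscale: keep every `factor`-th row and column."""
    return [row[::factor] for row in frame[::factor]]


def pip(main, inset, factor=3, corner="upper_right"):
    """Return a copy of `main` with a shrunken `inset` pasted into a top corner."""
    small = shrink(inset, factor)
    out = [list(row) for row in main]      # copy so `main` is left untouched
    small_w = len(small[0])
    x0 = len(out[0]) - small_w if corner == "upper_right" else 0
    for y, row in enumerate(small):
        out[y][x0:x0 + small_w] = row      # overwrite the corner pixels
    return out


def pip_sequence(class_frames, teacher_frames, teacher_offset=0, **options):
    """Pair class frame i with teacher frame i + teacher_offset, then composite."""
    aligned = teacher_frames[teacher_offset:]
    return [pip(c, t, **options) for c, t in zip(class_frames, aligned)]


# Toy frames: "c" marks class-camera pixels, "t" marks teacher-camera pixels.
class_frame = [["c"] * 8 for _ in range(6)]
teacher_frame = [["t"] * 6 for _ in range(6)]
composite = pip(class_frame, teacher_frame, factor=3, corner="upper_right")
for row in composite:
    print("".join(row))
```

The sketch makes one trade-off of the technique visible: whatever part of the main image lies under the inset is overwritten, so the choice of corner is itself a decision about what will not be seen.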




Figure 5. Example of a “pipped” two-camera video recording of the class and teacher. 



"Pipping" the two camera images allows us to impart an impressive artificial perspective to 
the interaction in the classroom. As analysts, we can now see both participants head-on in a way 
that did not exist for the participants or anyone else and would not exist for us but for a bit of 
technological manipulation. It takes some getting used to in order to coordinate the two images 
in one's mind and so take advantage of such a unique view, but the wealth of data thus available 
is striking, especially when compared to alternative methods of data gathering and rendering. 

What, then, is an example of something more that can be seen as a result of this 
technique? While there are many points of interest in even the short excerpt we have used above 
(Figure 2), we will now focus on a brief moment of that excerpt in which there is tight 
coordination between the talk, gesture, and gaze of the student and teacher. Using the integration 
of dual perspectives provided by pipped video, we are able to uncover a remarkable moment of 
shared coordination in the co-construction and use of a public object of literacy, an overhead 
projection of class assignments. 



An Example of Analysis with Two-Camera Video Data 

In this section we will carry out a detailed description and analysis of a moment of 
interactional synchrony in which a student and teacher carry out actions (i.e., pointing and 
turning to look) that display an acute intersubjective awareness of each other and the task in 
which they are engaged. The moment is notable both as an example of the level of detail that is 
relevant to classroom interaction and as an example of the kind of interaction that is only 
available to researchers through a pipped two-camera technique of data acquisition. The excerpt, 
which is part of the example referred to in Figures 1-5 above, is taken from a senior-level physics 
class at an urban high school in the Midwest. In the excerpt (see the transcript in Figure 2), the 
teacher (TE) is going over assignments and their due dates using a transparency on an overhead 
projector. Although it is not visible in the videotape, the information on the transparency is laid 
out in a grid-like table format, similar (but not identical) to the representation in Figure 6. 



^ In this context, “object of literacy” is meant to refer to a physical resource (e.g., book, magazine, newspaper, road 
sign, restaurant menu, etc.) that contains a written form of the language, and which participants can use in performing 
a variety of actions. 

^ The table in Figure 6 was taken from the same class but at an earlier date. Unfortunately, the particular table 
referred to in this excerpt was not available. 

Figure 6. Example of an assignment sheet similar to the one referred to in the excerpt. 

While the teacher is referring to the assignments and emphasizing their due dates, a student 
(AL) objects to the apparent extension of the due date for assignment “6C” because of an “AP 
thing.” To support this objection, the student directs the teacher’s attention to the notation 
“NLP,” which stands for “No Late Papers,” next to the due date for the paper on the overhead. 
While this represents, for the teacher, a minor adjustment of a deadline to accommodate some 
students who spent class time taking advanced placement tests, it also appears to some students 
as an unfair extension of the due date that was not available to those who were not taking the 
tests. Thus, a heated discussion between students ensues and the teacher attempts to quell it. 
The moment of interest for our analysis occurs between lines 5 and 13 of the transcript shown in 
Figure 2 and is reproduced below. 

 5   TE:  a:nd some of you are turning in six cee today. 
 6        (1.3) 
 7   TE:  right? thi[s is it] its got to be in today. 
 8   KI:            [(yeah) ] 
 9   AL:  hey but it says no- (0.2) look en ell pee= 
10   TE:  =right. and [the reason 
11   AL:              [five five ninety-seven 
12   TE:  right (.) but the reason I'm moving back a day is 
13        because of that ay pee thing y[esterday 

In particular, line 9 contains an especially fluid moment of mutual body coordination between 
AL and TE — visible only with a pipped videotape of the two cameras — with respect to the 
overhead projection screen. Let us first describe the talk and associated movement, then we can 
begin to examine and discuss how this coordination is achieved. 

During the 0.2 second pause in AL’s talk, AL begins to raise his hand. Simultaneously with 
the raising of AL’s hand, TE turns to look at the screen so that he is actually looking fully at the 
screen by the time AL finishes raising his hand. Rather than the pointing occurring first and then 
the redirection of eye gaze — as might be presumed by an ethnographer watching the interaction 
in the classroom — the actions are contemporaneous. Note the movement of AL’s hand and TE’s 
gaze direction in Figures 7-10, which illustrate the sequence of action just before and during this 
0.2 second time frame. We now consider this moment in more detail in order to see how this 
coordination is achieved. 

Figure 7. Time 1: Just before AL says “no-” in line 9, AL with hands down and TE looking at AL. 

Figure 8. Time 2: During AL’s “no-” in line 9, AL begins to raise left hand and TE still looking at AL. 

Figure 9. Time 3: During 0.2 second pause in line 9. AL begins to thrust pointing hand toward overhead screen 
and TE midway through turning head toward screen. 

Figure 10. Time 4: During 0.2 second pause in line 9. AL finishes pointing hand toward overhead screen 
and TE finished turning head toward screen. 

With the beginning of the word “six” (line 5) TE begins to rotate his head toward the class. 
By the end of “cee” his face is toward the class and he finishes the movement with a head nod 
that coincides with a voiced emphasis on “cee.” The intonation of “today” (line 5) has a strong 
falling (ending) intonation. TE maintains his body orientation throughout the 1.3 second pause 
(line 6). At the end of the pause, and coinciding with his rising (questioning) intonation “right” 
(line 7), TE opens up his left hand, which is directed toward a location on the overhead screen. 
TE nods his head throughout “right? This is it” (line 7). 

During AL’s “hey but it says” (line 9), TE turns his gaze toward AL while maintaining body 
and arm position (Figure 7). With the beginning of “no-” (line 9) AL begins to raise his left hand 
(Figure 8). By the end of the 0.2 sec pause, AL’s left hand is pointing toward the front of the 
class, presumably at the overhead screen (Figures 9 & 10). Throughout the 0.2 sec pause and 
finishing with the final thrusting part of AL’s pointing motion, TE rotates his head from 
facing AL back toward the overhead screen. When AL’s hand is in its final position pointing 
toward the front of the class, TE’s head has finished rotating and his gaze is redirected to some 
area on the overhead projection screen. AL’s pointing and TE’s head movement all take place 
within the 0.2 sec pause (line 9). As AL begins to say “look” (line 9) TE’s gaze is already 
directed to a point on the screen. During “look en ell” (line 9) TE’s gaze appears to scan up and 
down and across the screen. By the end of “ell” TE’s gaze rests on a spot to the right of where he 
is currently pointing. From the start of “pee” (line 9) TE begins moving his body toward the spot 
on the right-hand side of the screen where his gaze is directed. As he moves his body he 
continues to hold his arm in the same pointing attitude so that the spot to which he is pointing 
moves across the screen. 

As AL overlaps TE’s “the reason” (line 10) with “five five ninety-seven” (line 11) TE turns 
his gaze from the screen toward AL, while continuing to move to the right-hand side of the 
screen. At the end of AL’s turn (line 11), TE simultaneously reaches the point to which he is 
moving so that his still-pointing hand is now resting over the destination position. TE then says 
“right” (line 12) and nods his head. He also begins turning his gaze toward AL and raising his 
right hand in a pointing gesture toward AL. During this, AL has been pointing continuously 
toward a location at the front of the classroom, but as TE begins to say “reason” (line 12) AL 
begins to drop his arm back down. By the time TE finishes “back a” (line 12) AL’s arm is 
completely lowered and he has hunched forward in his desk. In this brief sequence, the 
redirection of TE’s gaze is simultaneous with the act of pointing by AL that is intended to 
accomplish the redirection. Under close examination, the fluid linkage between the two actions 
is striking and it almost appears as if the actions, pointing and gaze redirection, were carried out 
in practiced unison. With this in mind, what are some of the ways in which the interactants may 
have laid the grounds for the achievement of such an act of synchrony? 

To begin with, there is evidence that the overhead projection screen (with the assignment 
information displayed on it) has been established as a kind of object of literacy available for 
public use and scrutiny. TE’s reference to it by pointing as he is reading information and the 
students’ gaze direction display that it is a readable resource held in common by both the teacher 
and the students during the interaction. Actions in this immediate context can therefore be taken 
as potentially relevant with respect to it, though, as we shall see, some work may still need to be 
done in order to foreground its relevance. 

Next, TE displays a prospective awareness that something about this assignment (“six cee”) 
is potentially problematic. His long (1.3 sec.) pause after his announcement that some students 
“are turning in six cee today” provides ample space for a turn on the part of any student to 
question or respond to this statement. Failing to receive a response from a student, TE begins 
again with a questioning “right?” that serves to emphasize the previous turn and its lack of 
uptake by the students, which in turn marks it as potentially noteworthy in some respect. Finally, 
TE reiterates the turning in of the assignment in line 7, after which he receives a response from 
AL. That AL does in fact treat TE’s utterance as noteworthy or problematic can be seen in the 
exclamatory “hey” and the contrastive “but” in the beginning of AL’s response. 

While his utterance may acknowledge the problematic nature of TE’s previous turn, when AL 
begins his turn with “hey but it says no. . . ,” he also creates a potential problem of reference with 
regard to the topic of the current talk. In the context of the immediately preceding interaction 
and the topics of previous turns, the pronoun “it” could refer to a variety of previous noun phrase 
referents such as the assignment (“its got to be in today”) or the due date for the assignment 
(“this is it”), or it could refer to a contextually established possibility such as the overhead 
projection screen. Admittedly, once the verb “say” is delivered the due date is precluded as a 
possible candidate of reference; “say” refers to an object of literacy, something that can “say” 
something, such as the written assignment or the text on the projection screen. Nevertheless, the 
exact reference remains ambiguous. 

In continuing his utterance with “. . . no . . . ,” AL may be either self-repairing the previous 
part of his utterance or, more likely, beginning the phrase “no late papers,” an expansion and 
interpretation of the acronymic notation “NLP” added on the line of information about 
assignment 6C displayed on the overhead screen. But during AL’s talk, TE has been looking at 
AL, not at the overhead screen. In this way, TE is providing visible evidence that he is perhaps 
not attending to the screen as the potential object of reference to AL’s “it.” If AL’s utterance at 
line 9 was designed to direct TE’s gaze to the overhead screen as the possible object of literacy 
that could be “saying” something, it does not succeed. 

In the context of AL’s lack of success in redirecting TE’s gaze verbally, raising his hand 
during the 0.2 sec pause in line 9 can be seen as a deictic gesture designed to repair this failure. 
That this gesture can effectively resolve any potential ambiguity of reference is displayed by TE’s 
simultaneous redirection of his gaze toward the screen before AL has even completed his gesture. 
By the time AL bolsters his deixis with the directive “look . . . ,” TE is already looking at the 
screen. Having successfully established the overhead screen as the object of reference, AL then 
switches to a literal description (“en ell pee”) of what is in fact now visible to TE, AL, and the 
rest of the class: the line on the overhead that contains the due date for assignment 6C also has 
the notation “NLP,” which stands for “no late papers” and which directly contradicts TE’s 
statement that assignment 6C is due “today.” AL’s move from an interpreted view of the 
assignment details found in “it says no ((possibly: late papers))” to a literal view of the overhead 
screen expressed in “look en ell pee ((NLP))” is thus integral to the overall process of redirecting 
TE’s gaze in the disambiguation of the reference to “it.” 

This establishment of reference finally allows AL to press his disagreement with TE’s change 
of due date for the assignment by using the grounds of legitimacy found in the information on the 
overhead screen. TE then acknowledges the legitimacy of AL’s disagreement by the lack of 
pause preceding his latched, or immediately adjacent, follow-on agreement (“=right,” at line 10) 
and his swift entry into an account (“. . . and the reason”) for the apparent discrepancy. AL 
responds to TE’s assent to account for the discrepancy by lowering his hand. 

Summary of Analysis 



This instance of mutual coordination in which the teacher’s gaze is successfully redirected 
toward the overhead projection screen simultaneously with a student’s pointing to the screen is 
made possible through a number of practices engaged in by the participants. First, the teacher’s 
talk displayed that he anticipated some kind of potential trouble with his statement about the 
assignment due date, thus making talk about the assignment relevant. Second, through both the 
teacher’s and the students’ visible reference to the overhead projection screen, it was 
co-constructed as a contextually available referent both for and in the talk of the classroom. 

There are several additional points that we wish to emphasize before concluding. First, note 
that this moment of coordinated orientation to a textual object could only have been captured for 
such a detailed analysis by using the enhanced data acquisition techniques described in this 
article. For example, the field notes for this class (Figure 1) make no mention of this exchange 
between the teacher and the student, and the audio transcript excludes any trace of the embodied 
activities of the interactants. Even the camera views of the student and the teacher, taken 
individually, fail to show the tight physical coordination of teacher and student as they 
collaboratively construct the overhead transparency as a relevant textual object and interact with 
and through it. The ability to “see” both the student and the teacher in their fleeting synchrony is 
a unique artifact constructed through technological processes available to the researcher. 



Conclusion 

Our report has ended with an extended micro-analysis of a tightly coordinated moment of 
interaction. This analysis, and the details of coordination and synchrony, would not have been 
available had we not first worked through the methodological and technological options 
described in this paper to arrive at the picture-in-picture two-camera solution. Thus, advanced 
technologies of data acquisition such as video recording should not be treated as mere gimmicks 
to be added to a study as a trendy afterthought to qualitative research on language; they should be 
considered as integral parts of the overall theoretical and methodological intent of the research 
project, and they should be given corresponding attention in the planning stages of the study. 
Moreover, the limitations of the technology should also be given serious consideration. Camera 
positioning will affect what the researcher takes as an approximation of what actually happened 
in an interaction, and the resolution of video (and audio) technology is never sufficient to resolve 
all uncertainties about who said or did what. It is therefore better to be aware of these limitations 
than to let them shape one's analysis in uncertain ways. In this way and others, researchers should 
be willing to experiment and constantly compare their methods and technology with their 
theoretical commitments. For while technology never eliminates the need for sensitive, 
inquisitive, skilled, and dedicated researchers in the field, its intelligent application can open up a 
whole new world to the researcher of language and interaction. 

In some ways, the current use of videotaped interaction is reminiscent of van Leeuwenhoek’s 
application of the microscope to the life sciences. In doing so, he effected a 
revolution in biology by making visible the previously unseen details of an unimagined world. 

As scientists of language and interaction, one of our goals is to similarly extend our view, 
through the use of video and other technologies, into the details of how students and teachers 
manage their skillful work of co-constructing the reality of classroom life. Over time, these 
microanalyses in English and other classes can help us better understand how these classroom 
interactions contribute to student understanding and achievement. 